Decision Tree#
See the backing repository for Decision Tree here.
Summary#
A supervised decision tree. This is a recursive partitioning method in which the feature space is repeatedly split into smaller partitions according to a split criterion. A predicted value is learned for each partition and stored in the “leaf nodes” of the learned tree. This is a light wrapper around the decision trees exposed in scikit-learn. Single decision trees often have weak predictive performance, but they are fast to train and good at identifying associations. Shallow decision trees are easy to interpret, but they quickly become complex and unintelligible as the depth of the tree increases.
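To make the recursive-partitioning idea concrete, here is a minimal sketch using plain scikit-learn (which this wrapper builds on). It fits a depth-2 tree on the iris dataset and prints the learned split rules; each printed leaf holds the class prediction for its partition.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# A shallow tree: at most two levels of splits, so the rules stay readable.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# export_text renders the recursive partitions as nested if/else rules.
print(export_text(tree, feature_names=list(load_iris().feature_names)))
```

At depth 2 the printed rules fit on a few lines; raising `max_depth` quickly multiplies the number of partitions, which is the interpretability trade-off described above.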
How it Works#
Christoph Molnar’s “Interpretable Machine Learning” e-book [1] has an excellent overview of decision trees that can be found here.
For implementation-specific details, scikit-learn’s user guide [2] on decision trees is solid and can be found here.
Code Example#
The following code trains a decision tree classifier on the breast cancer dataset. The visualizations provided cover both global and local explanations.
```python
from interpret import set_visualize_provider
from interpret.provider import InlineProvider
set_visualize_provider(InlineProvider())
```

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

from interpret.glassbox import ClassificationTree
from interpret import show

seed = 42
np.random.seed(seed)

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=seed)

dt = ClassificationTree(random_state=seed)
dt.fit(X_train, y_train)

auc = roc_auc_score(y_test, dt.predict_proba(X_test)[:, 1])
print("AUC: {:.3f}".format(auc))
```

```
AUC: 0.957
```

```python
show(dt.explain_global())
```

```python
show(dt.explain_local(X_test[:5], y_test[:5]), 0)
```
Further Resources#
Bibliography#
[1] Christoph Molnar. Interpretable machine learning. Lulu. com, 2020.
[2] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, and others. Scikit-learn: machine learning in python. the Journal of machine Learning research, 12:2825–2830, 2011.
API#
ClassificationTree#
- class interpret.glassbox.ClassificationTree(feature_names=None, feature_types=None, max_depth=3, **kwargs)#
Classification tree with shallow depth.
Initializes tree with low depth.
- Parameters:
feature_names – List of feature names.
feature_types – List of feature types.
max_depth – Max depth of tree.
**kwargs – Kwargs sent to __init__() method of tree.
- explain_global(name=None)#
Provides global explanation for model.
- Parameters:
name – User-defined explanation name.
- Returns:
An explanation object, visualizing feature-value pairs as horizontal bar chart.
- explain_local(X, y=None, name=None)#
Provides local explanations for provided instances.
- Parameters:
X – Numpy array for X to explain.
y – Numpy vector for y to explain.
name – User-defined explanation name.
- Returns:
An explanation object.
- fit(X, y)#
Fits model to provided instances.
- Parameters:
X – Numpy array for training instances.
y – Numpy array as training labels.
- Returns:
Itself.
- predict(X)#
Predicts on provided instances.
- Parameters:
X – Numpy array for instances.
- Returns:
Predicted class label per instance.
- predict_proba(X)#
Probability estimates on provided instances.
- Parameters:
X – Numpy array for instances.
- Returns:
Probability estimate of instance for each class.
- score(X, y, sample_weight=None)#
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set be correctly predicted for each sample.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Test samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True labels for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Returns:
score – Mean accuracy of `self.predict(X)` w.r.t. `y`.
- Return type:
float
RegressionTree#
- class interpret.glassbox.RegressionTree(feature_names=None, feature_types=None, max_depth=3, **kwargs)#
Regression tree with shallow depth.
Initializes tree with low depth.
- Parameters:
feature_names – List of feature names.
feature_types – List of feature types.
max_depth – Max depth of tree.
**kwargs – Kwargs sent to __init__() method of tree.
- explain_global(name=None)#
Provides global explanation for model.
- Parameters:
name – User-defined explanation name.
- Returns:
An explanation object, visualizing feature-value pairs as horizontal bar chart.
- explain_local(X, y=None, name=None)#
Provides local explanations for provided instances.
- Parameters:
X – Numpy array for X to explain.
y – Numpy vector for y to explain.
name – User-defined explanation name.
- Returns:
An explanation object.
- fit(X, y)#
Fits model to provided instances.
- Parameters:
X – Numpy array for training instances.
y – Numpy array as training labels.
- Returns:
Itself.
- predict(X)#
Predicts on provided instances.
- Parameters:
X – Numpy array for instances.
- Returns:
Predicted class label per instance.
- score(X, y, sample_weight=None)#
Return the coefficient of determination of the prediction.
The coefficient of determination \(R^2\) is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares `((y_true - y_pred) ** 2).sum()` and \(v\) is the total sum of squares `((y_true - y_true.mean()) ** 2).sum()`. The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape `(n_samples, n_samples_fitted)`, where `n_samples_fitted` is the number of samples used in the fitting for the estimator.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Returns:
score – \(R^2\) of `self.predict(X)` w.r.t. `y`.
- Return type:
float
Notes
The \(R^2\) score used when calling `score` on a regressor uses `multioutput='uniform_average'` from version 0.23 to keep consistent with the default value of `r2_score()`. This influences the `score` method of all the multioutput regressors (except for `MultiOutputRegressor`).